Knowledge-Graph- and GCN-Based Domain Chinese Long Text Classification Method
نویسندگان
چکیده
In order to solve the current problems in domain long text classification tasks, namely, length of a document, which makes it difficult for model capture key information, and lack expert knowledge, leads insufficient accuracy, based on knowledge graph convolutional neural network is proposed. BERT used encode text, each word’s corresponding vector as node so that initialized contains rich semantic information. Using trained entity–relationship extraction model, entity-to-entity–relationships document are extracted edges network, together with syntactic dependency The structure mask learn about edge relationships types further enhance learning ability dependencies between words. method improves accuracy by fusing features data features. Experiments three datasets—IFLYTEK, THUCNews, Chinese corpus Fudan University—show improvements 8.8%, 3.6%, 2.6%, respectively, relative model.
منابع مشابه
Chinese Short Text Classification Based on Domain Knowledge
People are generating more and more short texts. There is an urgent demand to classify short texts into different domains. Due to the shortness and sparseness of short texts, conventional methods based on Vector Space Model (VSM) have limitations. To tackle the data scarcity problem, we propose a new model to directly measure the correlation between a short text instance and a domain instead of...
متن کاملSome Studies on Chinese Domain Knowledge Dictionary and Its Application to Text Classification
In this paper, we study some issues on Chinese domain knowledge dictionary and its application to text classification task. First a domain knowledge hierarchy description framework and our Chinese domain knowledge dictionary named NEUKD are introduced. Second, to alleviate the cost of construction of domain knowledge dictionary by hand, we use a boostrapping-based algorithm to learn new domain ...
متن کاملImproved Graph Based K-NN Text Classification
This paper presents an improved graph based k-nn algorithm for text classification. Most of the organization are facing problem of large amount of unorganized data. Most of the existing text classification techniques are based on vector space model which ignores the structural information of the document which is the word order or the co-occurrences of the terms or words. In this paper we have ...
متن کاملDomain Based Classification of Punjabi Text Documents
With the dramatic increase in the amount of content available in digital forms gives rise to a problem to manage this online textual data. As a result, it has become a necessary to classify large texts (documents) into specific classes. And Text Classification is a text mining technique which is used to classify the text documents into predefined classes. Most text classification techniques wor...
متن کاملA Knowledge-based Approach to Text Classification
The paper presents a simple and effective knowledge-based approach for the task of text classification. The approach uses topic identification algorithm named FIFA to text classification. In this paper the basic process of text classification task and FIFA algorithm are described in detail. At last some results of experiment and evaluations are discussed.
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Applied sciences
سال: 2023
ISSN: ['2076-3417']
DOI: https://doi.org/10.3390/app13137915